An Open Source Approach to Medium-Term Data Archiving
Authors: Sherine Antoun, John Fulcher, Carole Alcock
Abstract
Medium- to long-term archiving of digital documents, beyond the lifespan of the authoring software and hardware, is a challenging problem. Magnetic and optical media are susceptible to environmental influences and deteriorate over time, often to the point where the archived documents can no longer be retrieved. Previous attempts to address this problem include migration and emulation, both of which have their attendant difficulties. It is the contention of the present study that an Open Source approach offers several advantages. More specifically, archiving the Open Source application programs (in source code, not executable form) along with the documents in question, in both plain and compressed form, significantly increases the likelihood of being able to retrieve such archives at some future time. The application source code can be recompiled to a form suitable for reading in (Open Source) viewers, thereby presenting the archived document to the user as the original author envisaged it. One set of experiments distributed documents, together with their (Open Source) authoring software, via a Portable Virtual Machine (PVM) program to unused disk space on a network of SUN workstations. The success of this approach was evaluated using four measures: (i) lossiness of conversion, (ii) editability, (iii) ability to save back to the original format, and (iv) functionality retention. In another series of experiments, artificial ('speckle' or salt-and-pepper) noise was deliberately introduced into the archived documents in order to mimic degradation of the storage medium over time. Survivability was found to depend heavily on file type: simple text files and MPEG movies were impervious to even 18% introduced noise. Source code programs and JPEG images, by contrast, were intolerant of even the smallest noise levels (although straightforward re-editing of the former led to error-free compilation without much difficulty). Lastly, it was found that decompression (specifically with the publicly available RAR decompressor) further enhanced the file recovery process. We conclude that an Open Source approach to the preservation of digital archives has considerable potential.
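The noise experiment described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the noise was injected, so the following assumes byte-level salt-and-pepper corruption: a chosen fraction of byte positions is overwritten with either 0x00 or 0xFF. The function names `add_speckle_noise` and `count_differences` are hypothetical and are not taken from the study.

```python
import random
from typing import Optional


def add_speckle_noise(data: bytes, fraction: float,
                      seed: Optional[int] = None) -> bytes:
    """Return a copy of `data` with a fraction of its bytes corrupted.

    Assumption (not from the study): noise is applied per byte, and each
    selected byte is forced to 0x00 ("pepper") or 0xFF ("salt").
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    rng = random.Random(seed)
    corrupted = bytearray(data)
    n_hits = round(fraction * len(data))
    # Pick distinct positions so that exactly n_hits bytes are overwritten.
    for pos in rng.sample(range(len(data)), n_hits):
        corrupted[pos] = rng.choice((0x00, 0xFF))
    return bytes(corrupted)


def count_differences(a: bytes, b: bytes) -> int:
    """Count the positions at which two equal-length byte strings differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    original = b"A" * 1000
    noisy = add_speckle_noise(original, 0.18, seed=0)
    print(count_differences(original, noisy), "of", len(original), "bytes changed")
```

Applying the same function to a plain text file and to a compressed or structured file (for example a JPEG) gives a rough feel for why survivability depends on file type: in plain text a corrupted byte damages one character, whereas in a compressed stream it can invalidate everything decoded after it.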
Similar resources
Data for the future: The German project "Co-operative development of a long-term digital information archive" (kopal)
Purpose: One of the unresolved problems of the global information society is ensuring the long-term accessibility of digital documents. The project kopal tackles this problem head-on: in a three-year project, kopal's objective is the practical testing and implementation of a cooperatively created and operated long-term archival system for digital resources. Design/methodology/approach: The system ...
Extended Poster Abstract: Open Source Solution for Massive Map Sheet Georeferencing Tasks for Digital Archiving
Scanned maps need to be georeferenced to be useful in a GIS environment for data extraction (vectorization), web publishing or spatially-aware archiving. Widely used software solutions with georeferencing functionality are designed to suit a universal scenario for georeferencing many different kinds of data sources. This general nature also makes them very time-consuming for georeferencing a l...
Study of Solute Dispersion with Source/Sink Impact in Semi-Infinite Porous Medium
Mathematical models for pollutant transport in semi-infinite aquifers are based on the advection-dispersion equation (ADE) and its variants. This study employs the ADE incorporating time-dependent dispersion and velocity and space-time dependent source and sink, expressed by one function. The dispersion theory allows mechanical dispersion to be directly proportional to seepage velocity. Initial...
Effective Triggers and Barriers of Self-archiving Behavior Displayed by Knowledge and Information Sciences' faculty members in Iran
Background and Aim: The present investigation was carried out in order to study the self-archiving behavior displayed by Knowledge and Information Sciences (KIS) faculty members in Iran. It intended to discover the incentives and barriers impacting on this behavior as well as arriving at a baseline for predicting the extent of self-archiving. Method: A descriptive survey method was deployed. Th...
Priming Effect of on the Enhancement of Germination Traits in Aged Seeds of Chamomile (Matricaria chamomilla L.) Seeds Preserved in Medium and Long-term Storage
Chamomile (Matricaria chamomilla L.) is a widely used medicinal plant possessing several pharmacological effects due to the presence of active compounds. In order to study the effects of seed priming on seedling growth of chamomile, an experiment based on a completely randomized design with three replications was conducted under greenhouse conditions in Research Institute of Forests and Rangeland...